Structure and modeling of the network of two-Chinese-character compound words in the Japanese language

Authors

  • Ken Yamamoto
  • Yoshihiro Yamazaki
Abstract

This paper proposes a numerical model of the network of two-Chinese-character compound words (the two-character network, for short). In this network, each Chinese character is a node, and each two-Chinese-character compound word is an edge linking two nodes. The basic principle of the model is that a more important character receives more edges. The importance of a character is measured by how frequently it appears in publications. The direction of each edge is determined by random numbers assigned to the nodes. The network generated by the model is small-world and scale-free, and it quantitatively reproduces the statistical properties of the actual two-character network.
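The abstract gives only the outline of the model, so the following Python sketch is a loose illustration rather than the authors' algorithm. It assumes that node pairs are drawn with probability proportional to character frequency and that each edge points from the node with the smaller random number to the node with the larger one. The function name, the sampling rule, the direction rule, and the toy frequencies are all hypothetical.

```python
import random


def generate_two_character_network(frequency, n_edges, seed=0):
    """Hypothetical sketch of a frequency-driven directed network.

    frequency: dict mapping a character (node) to its importance weight.
    Returns a set of directed edges (first_character, second_character).
    """
    rng = random.Random(seed)
    chars = list(frequency)
    weights = [frequency[c] for c in chars]

    # Assumption: one uniform random number per node decides edge direction.
    node_number = {c: rng.random() for c in chars}

    edges = set()
    attempts = 0
    max_attempts = 100 * n_edges  # guard against an unreachable edge count
    while len(edges) < n_edges and attempts < max_attempts:
        attempts += 1
        # Assumption: important (frequent) characters are picked more often,
        # so they accumulate more edges.
        a, b = rng.choices(chars, weights=weights, k=2)
        if a == b:
            continue  # this sketch disallows self-loops
        # Assumption: the edge runs from the smaller to the larger number.
        if node_number[a] < node_number[b]:
            edges.add((a, b))
        else:
            edges.add((b, a))
    return edges


# Toy frequencies, made up for illustration only (not data from the paper).
toy_frequency = {"日": 50, "本": 30, "人": 10, "大": 5, "学": 5}
edges = generate_two_character_network(toy_frequency, n_edges=6)

# Count edges per character to see which nodes collected the most.
degree = {c: 0 for c in toy_frequency}
for first, second in edges:
    degree[first] += 1
    degree[second] += 1
print(sorted(edges))
print(degree)
```

One consequence of this direction rule is that a pair of characters can be linked in only one direction, whereas a real vocabulary may contain both orderings of the same two characters. The sketch keeps the simpler rule for brevity.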

Similar resources

Network of two-Chinese-character compound words in Japanese language

Some statistical properties of a network of two-Chinese-character compound words in Japanese language are reported. In this network, a node represents a Chinese character and an edge represents a two-Chinese-character compound word. It is found that this network has properties of “small-world” and “scale-free.” A network formed by only Chinese characters for common use (joyo-kanji in Japanese),...

Word-Forming Process in Azeri Turkish Language

The subject intended to study the general methods of natural word-forming in Azeri Turkish language. This study aimed to reach this purpose by analyzing the construction of compound Azeri Turkish words. Same’ei (2016) did a comprehensive study on word-forming process in Farsi, which was the inspiration source of this study for Azeri Turkish language word-forming. Numerous scholars had done vari...

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English are hyphenated, with the hyphen separating the parts. Persian has multi-part words as well. Under Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...

Handwritten Character Recognition using Modified Gradient Descent Technique of Neural Networks and Representation of Conjugate Descent for Training Patterns

The purpose of this study is to analyze the performance of Back propagation algorithm with changing training patterns and the second momentum term in feed forward neural networks. This analysis is conducted on 250 different words of three small letters from the English alphabet. These words are presented to two vertical segmentation programs which are designed in MATLAB and based on portions (1...

The Source of Human Knowledge: Plato’s problem and Orwell’s problem

Chomsky cannot help wondering at the fact that we, despite such vast evidence, have little knowledge about the obvious evidence. A good example, I think, is the child’s way of first language acquisition. A great many researchers have studied various aspects of child language acquisition at different stages of the child’s life and have brought to light many details of language development. However,...

Journal title:
  • CoRR

Volume: abs/1405.2167

Publication date: 2013